Introduction

Scientific research often includes some form of personal data. The amount of personal data that is processed in research is likely underestimated, as researchers may be unaware of what personal data are or whether they are being collected. With the implementation of the General Data Protection Regulation (GDPR) in 2018, stricter legal requirements apply to handling personal data and its sharing and publication. In our own experience, the number and complexity of questions on handling personal data in scientific research at Utrecht University (UU) is increasing.

Our premise at Research Data Management Support (RDM Support) is to assist researchers with any issues surrounding the management of their research data, including research data that contains personal data. To understand how we can best help researchers with their privacy-related questions and needs, we wanted to investigate:

  1. To what extent are UU researchers aware of privacy legislation and practices?
  2. What data privacy issues do UU researchers typically run into?
  3. What support do UU researchers need to handle personal data?

To answer these questions, we set up an online survey and planned one-on-one meetings with a selection of UU researchers. This report describes our methods, results and the future steps that data support at UU can take to improve its services concerning privacy in research.

This survey was part of a larger project at UU, the Data Privacy Project. This project is a data support effort within UU that aims to provide actionable and FAIR (Findable, Accessible, Interoperable, Reusable) information and tools for researchers to handle personal data in their research.

Methods

Survey questions

The Data Privacy Survey was created by the project coordinators (DH, NM) with input from a wide variety of experts, consisting of privacy experts, data managers, data consultants, and IT staff (e.g., information security, research engineering). The survey consisted of 19-23 questions (the exact number differing depending on given answers), which were separated into 5 sections:

  1. About you and your research: 5 questions about the faculty, department, position, and type of (personal) data the researcher works with.
  2. Measures and documents: 4-6 questions about researchers’ familiarity with and use of privacy-related measures (e.g., processing register, Data Protection Impact Assessment, encryption, pseudonymisation, etc.), data storage, and informed consent forms.
  3. Data sharing, archiving and publication: 3-4 questions about researchers’ data sharing and publishing practices.
  4. Finding support: 3-4 questions about the awareness of several data support channels.
  5. Improving our services: 4 questions about researchers’ challenges, needs, and suggestions to improve UU’s data privacy support.

The duration of 10 minutes that we estimated in advance proved to be relatively accurate: respondents spent a median of 8.4 minutes on the survey, although the variation was very large (range: 0.4 minutes to 5.1 days).
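The median is reported here because a few respondents who leave the survey open for days would dominate a mean. A minimal Python sketch of that effect, using made-up completion times (not the survey's data) and a hypothetical helper `summarise`:

```python
from statistics import mean, median

# Hypothetical completion times in minutes (NOT the survey's data).
# The last respondent left the survey open for roughly five days.
durations_min = [0.4, 6.0, 7.5, 8.4, 9.1, 12.0, 5.1 * 24 * 60]


def summarise(durations):
    """Return the median, mean and range of completion times in minutes."""
    return {
        "median": median(durations),
        "mean": mean(durations),
        "min": min(durations),
        "max": max(durations),
    }


summary = summarise(durations_min)
print(summary)
```

In this toy sample the median stays close to the typical completion time, while the mean is pulled above 1000 minutes by the single outlier.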

Procedure online survey

The survey was created in Qualtrics and distributed from March 21st, 2022 onwards via several communication channels to reach as many UU researchers as possible:

  • An email was sent to all academic staff at UU through central communication channels. The reasoning was that this would be the most effective way to reach as many UU researchers as possible, accepting that we would likely miss a small group of non-academic personnel who are also involved in research in some way.
  • A mention in several Faculty newsletters.
  • Social media messages (e.g., on Twitter).
  • A news item on the UU intranet.
  • Via data support colleagues, who were asked to point researchers they were in contact with to the survey.

Participation in the survey was voluntary. All results reported here are from researchers who provided their active consent, worked at Utrecht University and indicated that they work with some type of research data.

Online survey respondents

UU-wide

The survey was filled out by 176 UU researchers. As can be seen in the figure below, we received responses from each UU faculty, but the distribution was not equal: the faculties of Science and Social and Behavioural Sciences were overrepresented, whereas the number of responses from the Faculty of Geosciences and the Faculty of Medicine was rather low. This can be explained by the sheer size of the faculties (e.g., the Faculty of Science is UU’s largest faculty), but also by the types of research performed there. For example, research performed at the Faculty of Geosciences largely involves natural scientific data, rather than data from or relating to humans. The Faculty of Medicine is located at the University Medical Center Utrecht (UMCU) and was not the primary target of the current survey, which was sent to UU researchers only. A small number of respondents indicated that they work at another part of the organisation, mainly University College Utrecht and the University Corporate Office.

Most of the survey respondents were relatively early career researchers (PhDs, junior researchers and postdoctoral researchers). We suspect that this is because 1) there simply are more people in these positions than there are in the others, and 2) early-career researchers may experience a greater need for support with respect to handling personal data.

Science

There were 44 respondents from the Science faculty in the online survey. It took them a median of 5.7 minutes to complete it (ranging from 0.6 minutes to 5.1 days). In the graph below, the representation of each department within the Science faculty is visualised.

FSBS

There were 41 respondents from the Faculty of Social and Behavioural Sciences (FSBS) in the online survey. It took them a median of 8.9 minutes to complete it (ranging from 0.4 minutes to 0.3 days). In the graph below, the representation of each department within FSBS is visualised.

Humanities

There were 27 respondents from the Faculty of Humanities in the online survey. It took them a median of 8.5 minutes to complete it (ranging from 0.9 minutes to 1.3 days). In the graph below, the representation of each department within the Faculty of Humanities is visualised.

Veterinary Medicine

There were 23 respondents from the Faculty of Veterinary Medicine in the online survey. It took them a median of 10.4 minutes to complete it (ranging from 1.4 minutes to 0.2 days). In the graph below, the representation of each department within the Faculty of Veterinary Medicine is visualised.

LEG

There were 21 respondents from the Faculty of Law, Economics, and Governance (LEG) in the online survey. It took them a median of 10.1 minutes to complete it (ranging from 3.5 minutes to 0.9 days). In the graph below, the representation of each department within LEG is visualised.

Geo

There were 14 respondents from the Faculty of Geosciences in the online survey. It took them a median of 7.1 minutes to complete it (ranging from 2.9 minutes to 0 days). In the graph below, the representation of each department within the Faculty of Geosciences is visualised.

One-on-one meetings

Besides the online survey, we organised one-on-one meetings with researchers, to hear about their personal experiences, challenges and needs concerning the handling of personal data in their research. Survey respondents could voluntarily leave their email address at the end of the survey to be contacted by us. These meetings were semi-structured and revolved around the following questions:

  • What made you leave your email address in the Data Privacy Survey? Related: What are your general experiences in handling personal data in research?
  • Which difficulties do you run into when handling personal data?
  • What support would you need to help you handle personal data in your research?
  • Do you have a concrete need for support at the moment?

All of the one-on-one meetings were conducted online and took approximately 30 minutes. During the meetings, one of the project coordinators (DH) was always present, together with either the other project coordinator (NM) or the relevant faculty privacy officer. The privacy officer was invited to a meeting only after the researcher had consented to this.

UU-wide

From the survey respondents, 40 researchers left their email address to be contacted. Of those, 28 researchers indicated that they were willing to meet with us. Below, the division over faculties can be seen for all interviewees. Notably, the distribution seemed to mirror the faculty distribution in the entire survey relatively well.

Science

From the survey respondents, 7 researchers from the Science faculty left their email address to be contacted, although not all researchers indicated to be willing to meet with us when contacted. So far, 7 online meetings have been conducted, each with the project coordinators. Below, the division over positions can be seen for all interviewees from the Science faculty.

FSBS

From the survey respondents, 9 researchers from the faculty of Social and Behavioural Sciences (FSBS) left their email address to be contacted, although not all researchers indicated to be willing to meet with us when contacted. So far, 6 online meetings have been conducted, most of them with a privacy officer present. Below, the division over positions can be seen for all interviewees from the faculty of Social and Behavioural Sciences (FSBS).

Humanities

From the survey respondents, 7 researchers from the Humanities faculty left their email address to be contacted, although not all researchers indicated to be willing to meet with us when contacted. So far, 5 online meetings have been conducted, all of them with a privacy officer present. Below, the division over positions can be seen for all interviewees from the Humanities faculty.

Veterinary Medicine

From the survey respondents, 7 researchers from the faculty of Veterinary Medicine left their email address to be contacted, although not all researchers indicated to be willing to meet with us when contacted. So far, 5 online meetings have been conducted, most of them with a privacy officer present. Below, the division over positions can be seen for all interviewees from the faculty of Veterinary Medicine.

LEG

From the survey respondents, 7 researchers from the faculty of Law, Economics, and Governance (LEG) left their email address to be contacted, although not all researchers indicated to be willing to meet with us when contacted. So far, 2 online meetings have been conducted, most of them with a privacy officer present. Below, the division over positions can be seen for all interviewees from the faculty of Law, Economics, and Governance (LEG).

Geo

From the survey respondents, 1 researcher from the Geosciences faculty left their email address to be contacted. So far, 1 online meeting has been conducted. Below, the division over positions can be seen for all interviewees from the Geosciences faculty.

Analysis

From the raw Qualtrics output, we first cleaned the data and split it into separate data files: cleaned closed survey responses, open text responses, and email addresses (see the pseudonymise-data.R script for details). Both the open text responses and the notes made during the one-on-one meetings were separately and manually scored to enable the extraction of action points.
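The splitting step can be illustrated with a small sketch. This is not the project's pseudonymise-data.R script (which is written in R); it is an illustrative Python example under stated assumptions: the column names (`faculty`, `position`, `challenges_text`, `email`), the `R001`-style identifiers and the function `split_responses` are all hypothetical.

```python
import csv
import io

# Hypothetical column names; a real Qualtrics export will differ.
CLOSED_COLS = ["faculty", "position"]
OPEN_COLS = ["challenges_text"]
EMAIL_COL = "email"


def split_responses(rows):
    """Split raw rows into closed responses, open-text responses and emails.

    Closed and open-text records share an arbitrary respondent id.
    Email addresses are kept without an id, so they cannot be linked
    back to the answers (an assumption made for this sketch).
    """
    closed, open_text, emails = [], [], []
    for i, row in enumerate(rows, start=1):
        pid = f"R{i:03d}"
        closed.append({"id": pid, **{c: row[c] for c in CLOSED_COLS}})
        # Keep an open-text record only if the respondent wrote something.
        if any(row[c].strip() for c in OPEN_COLS):
            open_text.append({"id": pid, **{c: row[c] for c in OPEN_COLS}})
        if row[EMAIL_COL].strip():
            emails.append({EMAIL_COL: row[EMAIL_COL].strip()})
    return closed, open_text, emails


# Made-up example export with two respondents.
raw_csv = """faculty,position,challenges_text,email
Science,PhD candidate,Unsure when data count as personal,a.researcher@example.org
Humanities,Postdoc,,
"""
rows = list(csv.DictReader(io.StringIO(raw_csv)))
closed, open_text, emails = split_responses(rows)
print(closed)
print(open_text)
print(emails)
```

Keeping the email addresses in a separate file without a shared identifier is one way to make sure contact details are not stored alongside survey answers. Note that identifiers assigned in row order are a weak pseudonym if the original export is retained.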

Below we report on the descriptive statistics or summaries from most of the survey questions. As we did not formulate hypotheses, no statistical analyses were performed.

Data and material availability

All survey-related documentation can be found in the dedicated survey repository. It contains:

As the dataset contains personal information (demographic information, open text responses, email addresses, etc.), and no consent was obtained to share those details, we are unable to share them in this repository.

Results

Types of research data

UU-wide

To investigate the types of research that were represented in the sample, we asked which types of data, and specifically which types of personal data, the respondents worked with. Most respondents indicated that they use tabular, textual, code and audio data as their primary research data formats. In terms of personal data types, demographic information, contact information and direct identifiers were most common among the respondents. This may be because these are indeed the most common types of personal data, but possibly also because these are the types that researchers most readily recognise as personal data.

Types of data used
Datatype Frequency
Tabular data 125 (71%)
Textual data 107 (60.8%)
Code/theoretical models 60 (34.1%)
Audio data 56 (31.8%)
Video data 39 (22.2%)
Images 38 (21.6%)
Physiological measurements 28 (15.9%)
Bio-medical samples and data 28 (15.9%)
Geographical data 25 (14.2%)
Physical samples 12 (6.8%)
Other 8 (4.5%)

Types of personal data used
Personal Datatype Frequency
Demographic information 102 (58%)
Contact information 82 (46.6%)
Direct identifiers 66 (37.5%)
Sensitive demographic information 36 (20.5%)
Human behaviour 32 (18.2%)
Derived personal data 30 (17%)
Health/physical information 26 (14.8%)
None 18 (10.2%)
Other 11 (6.2%)
Sensitive direct identifiers 9 (5.1%)


When comparing faculties, it appears that most personal data is processed in the Faculty of Social and Behavioural Sciences (FSBS), although the faculties of Science and Veterinary Medicine also seem to process a fair amount of personal data. In the faculty subsections below, we look further into the types of (personal) data processed within each faculty.
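Because respondents could select several data types, the percentages in the frequency tables above do not sum to 100%: each count is divided by the number of respondents, not by the number of selections. A minimal Python sketch of such a tally follows. The helper names `freq_label` and `tally` and the four example respondents are hypothetical; only the two counts passed to `freq_label` at the end come from the UU-wide table.

```python
from collections import Counter


def freq_label(count, n):
    """Format a count as 'count (pct%)', with pct rounded to one decimal."""
    return f"{count} ({round(100 * count / n, 1):g}%)"


def tally(responses):
    """Count how many respondents selected each option of a multi-select item.

    `responses` holds one list of selected options per respondent.
    Percentages are relative to the number of respondents.
    """
    n = len(responses)
    counts = Counter(opt for selected in responses for opt in set(selected))
    return [(opt, freq_label(c, n)) for opt, c in counts.most_common()]


# Made-up answers from four respondents (NOT the survey's data).
example = [
    ["Tabular data", "Textual data"],
    ["Tabular data"],
    ["Tabular data", "Audio data"],
    ["Textual data"],
]
table = tally(example)
for option, label in table:
    print(option, label)

# Counts taken from the UU-wide table, with n = 176 respondents.
print(freq_label(125, 176))
print(freq_label(107, 176))
```

The choice of denominator matters when reading the tables: the same count expressed against a single faculty's respondents gives a much larger percentage than when it is expressed against all 176 respondents.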

Science

As with the university-wide data, researchers from the Science faculty indicated that they also most often used tabular data, textual data and code/theoretical models in their research. The same goes for the types of personal data: the same top-3 types are used here as in the entire university (demographic information, contact information, direct identifiers).

Types of data used
Datatype Frequency
Tabular data 34 (77.3%)
Textual data 27 (61.4%)
Code/theoretical models 24 (54.5%)
Images 13 (29.5%)
Video data 9 (20.5%)
Bio-medical samples and data 8 (18.2%)
Audio data 8 (18.2%)
Physical samples 6 (13.6%)
Physiological measurements 5 (11.4%)
Geographical data 4 (9.1%)
Other 2 (4.5%)

Types of personal data used
Personal Datatype Frequency (% of all 176 survey respondents)
Demographic information 17 (9.7%)
Direct identifiers 13 (7.4%)
Contact information 13 (7.4%)
None 12 (6.8%)
Derived personal data 11 (6.2%)
Human behaviour 9 (5.1%)
Health/physical information 8 (4.5%)
Sensitive direct identifiers 4 (2.3%)
Sensitive demographic information 4 (2.3%)
Other 2 (1.1%)


As a large part of the researchers from the Science faculty indicated not to use any personal data whatsoever in their research, it may be useful to investigate in which departments these researchers work. In the figure below, the division of personal data types per Science department is plotted. As can be expected, the departments of Biology, Physics and Mathematics do not seem to process much personal data, whereas in the Information and Computing Sciences and Pharmaceutical Sciences departments, the most personal data is processed. This makes sense considering the types of research performed in these departments.

FSBS

As with the university-wide data, researchers from the faculty of Social and Behavioural Sciences (FSBS) indicated that they also most often used tabular data, textual data and code/theoretical models in their research. The same goes for the types of personal data: demographic and contact information are most often used. Interestingly, sensitive demographic information is used much more relative to the UU-wide responses.

Types of data used
Datatype Frequency
Tabular data 33 (80.5%)
Textual data 18 (43.9%)
Code/theoretical models 12 (29.3%)
Audio data 12 (29.3%)
Video data 10 (24.4%)
Physiological measurements 8 (19.5%)
Geographical data 6 (14.6%)
Other 4 (9.8%)
Bio-medical samples and data 4 (9.8%)
Images 3 (7.3%)
Physical samples 1 (2.4%)

Types of personal data used
Personal Datatype Frequency (% of all 176 survey respondents)
Demographic information 32 (18.2%)
Contact information 20 (11.4%)
Sensitive demographic information 16 (9.1%)
Direct identifiers 14 (8%)
Human behaviour 8 (4.5%)
Derived personal data 8 (4.5%)
Health/physical information 6 (3.4%)
Other 3 (1.7%)
Sensitive direct identifiers 1 (0.6%)


When comparing the types of personal data across departments of the faculty, the departments of Sociology and Cultural Anthropology seem to process the least amount of personal data - although these departments may also simply be underrepresented in the survey sample. Most other departments process quite some personal data, as can be expected from a faculty of which the research focuses almost exclusively on humans.

Humanities

As with the university-wide data, researchers from the faculty of Humanities indicated that they also most frequently used tabular data, textual data and audio data in their research. The same goes for the types of personal data: the same top-3 types are present here as in the entire university (demographic information, contact information, direct identifiers).

Types of data used
Datatype Frequency
Textual data 18 (66.7%)
Tabular data 13 (48.1%)
Audio data 12 (44.4%)
Video data 6 (22.2%)
Images 5 (18.5%)
Code/theoretical models 4 (14.8%)
Physiological measurements 2 (7.4%)
Bio-medical samples and data 2 (7.4%)

Types of personal data used
Personal Datatype Frequency (% of all 176 survey respondents)
Demographic information 14 (8%)
Direct identifiers 11 (6.2%)
Contact information 10 (5.7%)
Human behaviour 5 (2.8%)
Sensitive demographic information 4 (2.3%)
Other 2 (1.1%)
None 2 (1.1%)
Derived personal data 2 (1.1%)
Sensitive direct identifiers 1 (0.6%)
Health/physical information 1 (0.6%)


If we look at the different departments within the faculty, it is clear that mostly the departments of Languages, Literature and Communication, and of Media and Culture Studies process personal data in their research. Especially researchers at the department of Languages, Literature and Communication seem to process a lot of personal data, which makes sense considering the type of research performed there.

Veterinary Medicine

When comparing the types of data that researchers at the Faculty of Veterinary Medicine work with, it is clear that researchers at this faculty work with slightly different types of data, i.e. biological and physiological data are more common among these researchers. Despite this, the most frequent types of personal data do correspond well to the university-wide numbers: demographic information, direct identifiers and contact information are also the most common types of personal data researchers at this faculty seem to deal with.

Types of data used
Datatype Frequency
Tabular data 20 (87%)
Textual data 13 (56.5%)
Physiological measurements 13 (56.5%)
Bio-medical samples and data 13 (56.5%)
Images 10 (43.5%)
Code/theoretical models 10 (43.5%)
Video data 9 (39.1%)
Geographical data 6 (26.1%)
Physical samples 4 (17.4%)
Audio data 4 (17.4%)
Other 1 (4.3%)

Types of personal data used
Personal Datatype Frequency (% of all 176 survey respondents)
Contact information 16 (9.1%)
Demographic information 15 (8.5%)
Direct identifiers 11 (6.2%)
Health/physical information 8 (4.5%)
Derived personal data 5 (2.8%)
Human behaviour 3 (1.7%)
Sensitive direct identifiers 2 (1.1%)
Sensitive demographic information 2 (1.1%)
None 2 (1.1%)
Other 1 (0.6%)


When looking at each department separately, it is clear that researchers in the department of Population Health Sciences process the large majority of the personal data within the faculty. Of course, it is also possible that this department was simply better represented in the survey than the other departments.

LEG

As with the university-wide data, researchers from the faculty of Law, Economics and Governance (LEG) indicated that they also most frequently used tabular data, textual data and audio data in their research. The same goes for the types of personal data: the same top-3 types are present here as in the entire university (demographic information, contact information, direct identifiers).

Types of data used
Datatype Frequency
Textual data 17 (81%)
Tabular data 13 (61.9%)
Audio data 12 (57.1%)
Images 3 (14.3%)
Video data 2 (9.5%)
Geographical data 2 (9.5%)
Code/theoretical models 1 (4.8%)

Types of personal data used
Personal Datatype Frequency (% of all 176 survey respondents)
Direct identifiers 14 (8%)
Demographic information 14 (8%)
Contact information 14 (8%)
Sensitive demographic information 8 (4.5%)
Human behaviour 3 (1.7%)
Health/physical information 3 (1.7%)
Derived personal data 2 (1.1%)


When looking at the different departments within the faculty, researchers in the department of Governance seem to process personal data the most often, followed by the department of Law. Of course, it is possible that the other departments process more personal data than the respondents indicated.

Geo

As with the university-wide data, researchers from the faculty of Geosciences indicated that they also most frequently used textual and tabular data. However, in contrast to the university-wide data, they are followed by geographical data and code/theoretical models. This is to be expected considering the faculty. In terms of personal data, however, the same top-3 types are present here as in the entire university (demographic information, contact information, direct identifiers).

Types of data used
Datatype Frequency
Textual data 11 (78.6%)
Tabular data 10 (71.4%)
Geographical data 7 (50%)
Code/theoretical models 7 (50%)
Audio data 4 (28.6%)
Video data 2 (14.3%)
Physical samples 2 (14.3%)
Images 2 (14.3%)

Types of personal data used
Personal Datatype Frequency (% of all 176 survey respondents)
Contact information 9 (5.1%)
Demographic information 7 (4%)
Direct identifiers 4 (2.3%)
None 2 (1.1%)
Human behaviour 2 (1.1%)
Derived personal data 2 (1.1%)
Sensitive direct identifiers 1 (0.6%)
Sensitive demographic information 1 (0.6%)
Other 1 (0.6%)


When comparing departments, it becomes immediately clear that most personal data seems to be processed in the department of Sustainable Development, and some also in the Human Geography and Spatial Planning department. This makes sense, as the research performed at the third department (Earth sciences) usually does not focus on human behaviour and thus does not involve much personal data, if any.

Current practices

The first part of the survey addressed the researchers’ current practices in handling personal data in their research.

Protective measures

UU-wide

With respect to organisational and technical measures used to handle personal data, most respondents indicated that they pseudonymise/anonymise their data. As this question is self-reported, we cannot assess whether the researchers’ data was actually sufficiently pseudonymised/anonymised in line with the GDPR. Secondly, many researchers seemed to implement access control and encryption, and complete a Data Management Plan (DMP) during their project(s). This is to be expected, as most funders nowadays require a DMP and many DMP templates explicitly address topics like pseudonymisation, access control and encryption. On the other hand, GDPR-specific assessments such as Data Protection Impact Assessments (DPIAs) or privacy reviews were least used - as it stands, these assessments are still only carried out on a case-by-case basis.

Science

Protective measures for Science faculty

FSBS

Protective measures for the Faculty of Social and Behavioural Sciences (FSBS)

Humanities

Protective measures for the Faculty of Humanities.

Veterinary Medicine

Protective measures for the Faculty of Veterinary Medicine.

LEG

Protective measures for the Faculty of Law, Economics and Governance.

Geo

Protective measures for the Faculty of Geosciences.

Storage media

UU-wide

When working with personal data, it is important to choose a sufficiently secure storage medium. Fortunately, most respondents indicated that they rely on storage solutions that are provided or recommended by UU, which in most cases are indeed safe for storing personal data. Nonetheless, some respondents indicated that they use non-UU solutions, including cloud solutions that UU advises against using. Several respondents also indicated that they use other storage solutions, such as those of external institutions (e.g., University Medical Center, Trimbos Institute, Central Bureau of Statistics) and repositories (e.g., DANS EASY, CLARIAH).

Science

Storage media used in the Science faculty.

FSBS

Storage media used in the Faculty of Social and Behavioural Sciences.

Humanities

Storage media used in the Faculty of Humanities.

Veterinary Medicine

Storage media used in the Faculty of Veterinary Medicine.

LEG

Storage media used in the Faculty of Law, Economics and Governance.

Geo

Storage media used in the Faculty of Geosciences.

Data Protection Impact Assessment (DPIA)

UU-wide

A Data Protection Impact Assessment (DPIA) is a legal instrument to assess the risks involved for data subjects, and helps determine the necessary safeguards to reduce those risks to an acceptable level. Despite it being an important legal instrument, most respondents indicated that they had never carried one out, or even heard of it. A minority of the sample had heard of it or had completed one. Currently, the desired scenario is that researchers get help from a privacy officer when performing a DPIA, and the results seem to indicate that this is indeed what happens in the majority of cases.

Science

Below the experience and help received with DPIAs is displayed for the Faculty of Science.

FSBS

Below the experience and help received with DPIAs is displayed for the Faculty of Social and Behavioural Sciences (FSBS).

Humanities

Below the experience and help received with DPIAs is displayed for the Faculty of Humanities.

Veterinary Medicine

Below the experience and help received with DPIAs is displayed for the Faculty of Veterinary Medicine.

LEG

Below the experience and help received with DPIAs is displayed for the Faculty of Law, Economics and Governance.

Geo

Below the experience and help received with DPIAs is displayed for the Faculty of Geosciences.

Data sharing practices

UU-wide

To investigate how often data are being shared and under which circumstances, we asked respondents if and with which parties they typically share their research data, and which measures they usually take to do so securely. While there were some respondents who indicated not to share their data at all, most respondents seem to only share research data within the organisation, and otherwise within the European Economic Area (EEA). This is relatively good news, as such transfers usually require little, if any, additional safeguards. Responses in the “Other” category, however, suggest that a lot of data actually are shared, for example with co-authors at another institution, students, or in “pseudonymised form”.

Concerning the measures used before sharing data, most respondents indicated that they pseudonymise their data, though we cannot assess the quality of such pseudonymisation. Researchers also seemed to use approved tools and agreements, and to provide access without data transfer. Only a few respondents indicated that they involved a data expert while transferring data, and the use of Standard Contractual Clauses appears limited.

Science

Below the data sharing practices across the Faculty of Science are visualised.

FSBS

Below the data sharing practices across the Faculty of Social and Behavioural Sciences (FSBS) are visualised.

Humanities

Below the data sharing practices across the Faculty of Humanities are visualised.

Veterinary Medicine

Below the data sharing practices across the Faculty of Veterinary Medicine are visualised.

LEG

Below the data sharing practices across the Faculty of Law, Economics and Governance are visualised.

Geo

Below the data sharing practices across the Faculty of Geosciences are visualised.

Data publishing

UU-wide

Open science and privacy are often seen as conflicting: personal data cannot simply be shared, as doing so requires at the very least a valid legal basis and additional safeguards. In our experience, not many datasets that contain personal data are therefore shared for reuse to date. The survey respondents seemed to confirm this: the majority indicated that they do not publish their data, or only in anonymised form (again, we cannot be certain whether the data were indeed entirely anonymised).

The primary reasons for not publishing data appear to be that researchers were still working on the data, that they did not want or need to publish their data, or that they could not anonymise the data. Other reasons given by the respondents included that publishing data is too much effort, that publication is undesirable, or that it was simply not considered.

Science

For the Faculty of Science, no respondents indicated that they publish datasets, only metadata if applicable. Therefore, no respondents answered the question about the format in which the data were published.

FSBS

Below you can see in which format researchers from the Faculty of Social and Behavioural Sciences (FSBS) publish their data, if they do (left), and the reasons researchers gave for not publishing data.

Humanities

Below you can see in which format researchers from the Faculty of Humanities publish their data, if they do (left), and the reasons researchers gave for not publishing data.

Veterinary Medicine

For the Faculty of Veterinary Medicine, no respondents indicated that they publish datasets, only metadata if applicable. Therefore, no respondents answered the question about the format in which the data were published.

LEG

Below you can see in which format researchers from the Faculty of Law, Economics and Governance publish their data, if they do (left), and the reasons researchers gave for not publishing data.

Geo

For the Faculty of Geosciences, no respondents indicated that they publish datasets, only metadata if applicable. Therefore, no respondents answered the question about the format in which the data were published.

Existing support channels

The second part of the survey concerned the visibility and use of existing support channels.

Faculty privacy officer

When asked whether respondents know who their faculty privacy officer is, a little over half of the respondents indicated that they do not (Yes: 64 (48%), No: 69 (52%)).

When comparing faculties (see below), it is striking that in the faculties where the most personal data seems to be processed, a small majority of respondents is not aware of who their faculty privacy officer is. This suggests either that researchers have simply never required help from their privacy officer, or that the faculty privacy officers could increase their visibility within their faculties.

Looking for help

UU-wide

When asked whether respondents have ever looked for support in the form of information, tools, or in-person support, an overwhelming majority indicated that they have, as can be seen in the graph below. Most respondents who looked for support indicated, however, that they have not always found the support they were looking for. Together with the results from the previous question, this suggests that the visibility of the current support channels could be improved. Note, however, that there are some differences between faculties (see the faculty subsections below).

Science

Below are the results for the Faculty of Science when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

FSBS

Below are the results for the Faculty of Social and Behavioural Sciences (FSBS) when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

Humanities

Below are the results for the Faculty of Humanities when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

Veterinary Medicine

Below are the results for the Faculty of Veterinary Medicine when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

LEG

Below are the results for the Faculty of Law, Economics and Governance (LEG) when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

Geo

Below are the results for the Faculty of Geosciences when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

Channels used to find support

UU-wide

The graph below indicates which channels respondents used most when looking for information about handling personal data. All sources mentioned were consulted by the respondents to some extent, but there are large differences between them. The university website and intranet are the most visited online resources for information about handling personal data. Notably, colleagues also appear to play an important role in informing researchers about how to handle personal data. This suggests that well-informed researchers have a positive effect on their colleagues as well. It also suggests that in-person support may be a more effective way of increasing awareness of privacy-related practices than more “distant” information sources.

Science

Below you can find the channels used by researchers from the Faculty of Science to find information about handling personal data:

FSBS

Below you can find the channels used by researchers from the Faculty of Social and Behavioural Sciences (FSBS) to find information about handling personal data:

Humanities

Below you can find the channels used by researchers from the Faculty of Humanities to find information about handling personal data:

Veterinary Medicine

Below you can find the channels used by researchers from the Faculty of Veterinary Medicine to find information about handling personal data:

LEG

Below you can find the channels used by researchers from the Faculty of Law, Economics and Governance (LEG) to find information about handling personal data:

Geo

Below you can find the channels used by researchers from the Faculty of Geosciences to find information about handling personal data:

Challenges and needs (survey)

UU-wide

As can be seen above, most researchers experience privacy as an obstacle to open science and research data management in some way. It is therefore important to offer support in this area. What this support should look like, however, varies somewhat. As can be seen below, accessible information and visible support channels seem to be the most wanted improvements to the current support, closely followed by UU-wide policy on the topic and privacy-related walk-in hours.

Science

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Science:

FSBS

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Social and Behavioural Sciences (FSBS):

Humanities

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Humanities:

Veterinary Medicine

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Veterinary Medicine:

LEG

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Law, Economics and Governance (LEG):

Geo

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Geosciences:

Challenges and needs (open questions, meetings)

As mentioned in the Methods section, the responses to the following open questions and the notes taken during the one-on-one meetings with researchers were coded to allow for easier analysis. From the survey, the following questions were coded:

  • “Which challenges concerning the handling of personal data do you run into most often?”
  • “What specific information or tools about handling personal data are you missing from existing sources?”
  • “What can we do better to support you in handling personal data in research?” (responses to the “Other” option)

The codes that were assigned in both the survey and the meeting notes can be seen in the word cloud below: the larger the font, the more often the code was assigned. The meaning of each code is explained in the file “codes-freq_survey_meetings.csv” and below. Please note that there is overlap in respondents between the open questions in the survey and the meeting notes, and therefore some codes may have been applied twice for the same researcher.
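
To illustrate the counting behind such a word cloud, and the double-counting caveat mentioned above, here is a minimal sketch in R. The data frame `coded`, its columns (`respondent`, `source`, `code`) and all of its rows are hypothetical; they do not reflect the actual structure or content of the survey files. The optional `wordcloud2()` call only runs if that package is installed.

```r
# Hypothetical coded responses: one row per code assignment
coded <- data.frame(
  respondent = c("r1", "r1", "r2", "r2", "r3", "r3"),
  source     = c("survey", "meeting", "survey", "survey", "meeting", "meeting"),
  code       = c("visibility", "visibility", "templates", "visibility",
                 "templates", "bureaucracy"),
  stringsAsFactors = FALSE
)

# Raw frequency: every assignment counts, so a code assigned to the same
# researcher in both the survey and a meeting is counted twice
raw_freq <- as.data.frame(table(word = coded$code),
                          responseName = "freq", stringsAsFactors = FALSE)

# Alternative: count each code at most once per researcher
per_person <- unique(coded[, c("respondent", "code")])
dedup_freq <- as.data.frame(table(word = per_person$code),
                            responseName = "freq", stringsAsFactors = FALSE)

print(raw_freq)
print(dedup_freq)

# Optional: build a word cloud from the raw frequencies. wordcloud2()
# expects a data frame with the words first and their frequencies second.
if (requireNamespace("wordcloud2", quietly = TRUE)) {
  cloud <- wordcloud2::wordcloud2(raw_freq)  # print(cloud) to render it
}
```

Comparing the two tables shows how much the overlap between survey and meeting respondents inflates a given code.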

UU-wide

Below we discuss the challenges and needs most frequently mentioned by researchers in the survey (open questions) and the one-on-one meetings:

Visibility and findability

It should be clear where to go for help (to whom or which webpage, etc.) (mentioned 27 times). Some respondents indicated that they did not know where to go for help with privacy-related matters; others mentioned that more support personnel should be available (8). One researcher said they would prefer having one place to go to for all data-related questions. As a matter of fact, many faculties have such a “one-stop-shop” in place, but responses such as these indicate that their visibility could be improved.

Closely related to visibility is findability. Many researchers pointed out that available information is difficult to find, or that it is confusing which source should be followed. As an example, one researcher from the Faculty of Humanities was convinced that they had found some useful information on the RDM Support website. However, upon closer inspection, it turned out that this information came from another data management-related website with the UU logo on it.

“UU offers an awful lot, but you have to visit many websites to find everything […] A clear overview of where to go for which information would be nice.”

Specific tools

Some (26) researchers expressed a need for a concrete tool, or for the improvement of an existing one.

Hands-on support

Many (24) researchers indicated that support staff could sometimes provide support in a more hands-on fashion, rather than giving abstract advice and telling researchers how not to do things. Some researchers added that privacy professionals sometimes tend to cling to the letter of the law, leading to significant delays in their projects, instead of looking at how to concretely solve existing issues in practice (mentioned 11 times):

“Because of these rules, we are sometimes holier than the Pope.”
“Actual getting-your-hands-dirty support: not the kind that tells you what to do, but also the kind that helps you by doing.”

Less bureaucracy

Processes were often experienced as time-inefficient, and sometimes longer and more bureaucratic than necessary (mentioned by 22 researchers). For example, the DPIA process was mentioned explicitly (10 times), as well as having to fill out too many forms with overlapping content (e.g., Privacy Scan, Data Management Plan, DPIA). Some (9) researchers argued that (part of) this burden should be relieved or carried by support staff:

“Fewer actions aimed at training academic staff on the subject matter, and more taking work off this group's hands.”
“Sharing data costs a lot of time and is inefficient when you do not do it often”

Unclear processes and guidelines

Many (18) researchers complained that it was unclear what was expected of them when processing personal data in their research, or said that they would like more, or more practical, guidelines (14) on this topic, for example on:

  • What steps do researchers need to take? (3)
  • Who should researchers ask for help in which situation?
  • Who is responsible or has the authority to make the right decisions when handling personal data? Do privacy officers/DPO have to be seen as gatekeepers (similar to an ethical committee) or as advisers? Do researchers have to follow the advice, or is that their own responsibility? (4)
  • When rules change, how does that affect what researchers need to do? (1)
  • Different processes for different types of research (e.g., student projects vs. large longitudinal projects, 3)

Information and education

A large part of the respondents expressed a need for more (clear) information and education with respect to handling personal data in research. In general, researchers indicated that the information offered to them should be clearer (3), simpler (9), consistent across resources (8), and possibly offered in the form of templates (14). Luckily, some researchers indicated that they are already happy with existing materials (4).

Information for specific research or data types

A number of researchers (9) indicated a need for more tailored information for specific types of data or research, for example for ethnographic data (2), historical data (2), or video data (3).

“The information is simply far too generic; there should be templates per type of research.”
“Many templates and explanatory models talk about data, data packages and metadata, but those words are not embedded in historical research. Confusion quickly arises about what exactly historians should do with archival material in light of privacy.”

Frequently asked questions

A number of researchers used the space in the open questions and/or the meetings to ask knowledge-related questions. These can be used in future support efforts, such as the Data Privacy Handbook, future courses, or other information campaigns, to better address researchers’ needs. Below are examples of the most commonly asked questions:

  • When are data still personal (9), how to anonymise data (3), and when are data anonymised sufficiently (5)?
  • What are the privacy-related requirements for students? (7)
  • How to store different types of personal data? (6)
  • Data sharing: what data can be shared and with whom? (7)
  • How to find a balance between informing data subjects too little vs. providing too much privacy-related information that will scare them off or hurt their trust in research? (5)
  • How to balance open science and privacy? (5) Related: when is reusing data for different purposes allowed (e.g., education data, data collected by students)? (4)
  • How to collaborate with multiple institutions? (4)

Education and onboarding

The need for more educational resources was also recognised by a selection of the respondents. Concretely, researchers could be educated more in the following ways:

  • Privacy and/or research data management as part of the master or PhD curriculum (8)
  • Privacy as part of the onboarding procedure of new employees (5)
  • Mandatory privacy training for supervisors, principal investigators, professors, and/or teachers (4)
  • A course on how to handle personal data in research (2)

“There is no one who tells at the start of your PhD how you should handle your data. […] I think new PhD students should get a basic course on data management and privacy.”

… or simply no issues

Notably, there were also researchers who indicated that they had not run into issues (yet), or that they had received sufficient and useful help (9). For example:

“The data manager and privacy officer of the faculty of humanities help a lot. This support is essential!”
“So far I have not had many problems. Our department's institutional review board always looks critically at research proposals, particularly with regard to the handling of personal data.”

Technical information

This document was last created on: 2022-09-27. It was created in the following environment:

## R version 4.2.0 (2022-04-22 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=Dutch_Netherlands.utf8  LC_CTYPE=Dutch_Netherlands.utf8   
## [3] LC_MONETARY=Dutch_Netherlands.utf8 LC_NUMERIC=C                      
## [5] LC_TIME=Dutch_Netherlands.utf8    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] wordcloud2_0.2.1  readxl_1.4.1      kableExtra_1.3.4  knitr_1.39       
##  [5] gridExtra_2.3     forcats_0.5.1     stringr_1.4.0     dplyr_1.0.9      
##  [9] purrr_0.3.4       readr_2.1.2       tidyr_1.2.0       tibble_3.1.8     
## [13] ggplot2_3.3.6     tidyverse_1.3.2   data.table_1.14.2
## 
## loaded via a namespace (and not attached):
##  [1] svglite_2.1.0       lubridate_1.8.0     assertthat_0.2.1   
##  [4] digest_0.6.29       utf8_1.2.2          R6_2.5.1           
##  [7] cellranger_1.1.0    backports_1.4.1     reprex_2.0.2       
## [10] evaluate_0.16       highr_0.9           httr_1.4.4         
## [13] pillar_1.8.0        rlang_1.0.4         googlesheets4_1.0.1
## [16] rstudioapi_0.13     jquerylib_0.1.4     rmarkdown_2.15     
## [19] labeling_0.4.2      webshot_0.5.3       googledrive_2.0.0  
## [22] htmlwidgets_1.5.4   munsell_0.5.0       broom_1.0.0        
## [25] compiler_4.2.0      modelr_0.1.8        xfun_0.32          
## [28] systemfonts_1.0.4   pkgconfig_2.0.3     htmltools_0.5.3    
## [31] tidyselect_1.1.2    viridisLite_0.4.0   fansi_1.0.3        
## [34] crayon_1.5.1        tzdb_0.3.0          dbplyr_2.2.1       
## [37] withr_2.5.0         grid_4.2.0          jsonlite_1.8.0     
## [40] gtable_0.3.0        lifecycle_1.0.1     DBI_1.1.3          
## [43] magrittr_2.0.3      scales_1.2.0        cli_3.3.0          
## [46] stringi_1.7.8       cachem_1.0.6        farver_2.1.1       
## [49] fs_1.5.2            xml2_1.3.3          bslib_0.4.0        
## [52] ellipsis_0.3.2      generics_0.1.3      vctrs_0.4.1        
## [55] tools_4.2.0         glue_1.6.2          hms_1.1.1          
## [58] fastmap_1.1.0       yaml_2.3.5          colorspace_2.0-3   
## [61] gargle_1.2.0        rvest_1.0.2         haven_2.5.0        
## [64] sass_0.4.2

  1. The Data Privacy Project is funded by Utrecht University’s Research IT program and a Digital Competence Center grant from the Dutch Organization for Scientific Research (NWO). It is led by Utrecht University’s Research Data Management Support, in collaboration with the University Library, Information Technology Services, first- and second-line privacy officers, information security, and others.